Linear discriminant analysis-based estimation of the false discovery rate for phosphopeptide identifications.

Authors

  • Xiuxia Du
  • Feng Yang
  • Nathan P Manes
  • David L Stenoien
  • Matthew E Monroe
  • Joshua N Adkins
  • David J States
  • Samuel O Purvine
  • David G Camp
  • Richard D Smith
Abstract

The development of liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has made it possible to characterize phosphopeptides in an increasingly large-scale and high-throughput fashion. However, extracting confident phosphopeptide identifications from the resulting large data sets in a similar high-throughput fashion remains difficult, as does rigorously estimating the false discovery rate (FDR) of a set of phosphopeptide identifications. This article describes a data analysis pipeline designed to address these issues. The first step is to reanalyze phosphopeptide identifications that contain ambiguous assignments for the incorporated phosphate(s) to determine the most likely arrangement of the phosphate(s). The next step is to employ an expectation maximization algorithm to estimate the joint distribution of the peptide scores. A linear discriminant analysis is then performed to determine how to optimally combine peptide scores (in this case, from SEQUEST) into a discriminant score that possesses the maximum discriminating power. Based on this discriminant score, the p- and q-values for each phosphopeptide identification are calculated, and the phosphopeptide identification FDR is then estimated. This data analysis approach was applied to data from a study of irradiated human skin fibroblasts to provide a robust estimate of FDR for phosphopeptides. The Phosphopeptide FDR Estimator software is freely available for download at http://ncrr.pnl.gov/software/.
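
The last two steps of the pipeline (combining peptide scores into a discriminant score, then converting p-values into q-values) can be illustrated with a minimal sketch. It is not the Phosphopeptide FDR Estimator implementation. It assumes two hypothetical score features per identification, assumes labeled example sets in place of the EM-estimated score distributions described above, and swaps in the standard Benjamini-Hochberg procedure for the q-values, since the abstract does not state the exact estimator. All function names and numbers are illustrative.

```python
def fisher_weights(class_a, class_b):
    """Fisher's linear discriminant for 2-feature samples.

    Returns w = Sw^-1 (mean_a - mean_b), where Sw is the pooled
    within-class scatter matrix. Inputs are lists of (x, y) pairs.
    Assumption: labeled classes stand in for EM-estimated distributions.
    """
    def mean(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    def scatter(rows, mu):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Closed-form inverse of the 2x2 matrix Sw, applied to diff.
    w0 = (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det
    w1 = (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det
    return [w0, w1]


def discriminant_score(weights, features):
    """Project one identification's score features onto the discriminant axis."""
    return sum(w * x for w, x in zip(weights, features))


def q_values(p_values):
    """Benjamini-Hochberg q-values (a stand-in for the paper's estimator).

    q[i] is the smallest FDR threshold at which identification i is accepted.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical (score_1, score_2) pairs for likely-correct and
# likely-incorrect identifications; the values are made up.
correct = [(3.0, 0.30), (4.0, 0.50), (3.5, 0.35)]
incorrect = [(1.0, 0.05), (2.0, 0.10), (1.5, 0.20)]
w = fisher_weights(correct, incorrect)
scores = [discriminant_score(w, f) for f in correct + incorrect]
```

Accepting every identification whose q-value falls below a chosen threshold, for example `q <= 0.05`, then yields a set with an estimated FDR at or below that threshold.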

Related resources

Transferred subgroup false discovery rate for rare post-translational modifications detected by mass spectrometry.

In shotgun proteomics, high-throughput mass spectrometry experiments and the subsequent data analysis produce thousands to millions of hypothetical peptide identifications. The common way to estimate the false discovery rate (FDR) of peptide identifications is the target-decoy database search strategy, which is efficient and accurate for large datasets. However, the legitimacy of the target-dec...

Decoy-free protein-level false discovery rate estimation

MOTIVATION Statistical validation of protein identifications is an important issue in shotgun proteomics. The false discovery rate (FDR) is a powerful statistical tool for evaluating the protein identification result. Several research efforts have been made for FDR estimation at the protein level. However, there are still certain drawbacks in the existing FDR estimation methods based on the tar...

Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics

Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence e...

Comparison of some chemometric tools for metabonomics biomarker identification

NMR-based metabonomics discovery approaches require statistical methods to extract, from large and complex spectral databases, biomarkers or biologically significant variables that best represent defined biological conditions. This paper explores the respective effectiveness of six multivariate methods: multiple hypotheses testing, supervised extensions of principal (PCA) and independent compon...

Confident and sensitive phosphoproteomics using combinations of collision induced dissociation and electron transfer dissociation

We present a workflow using an ETD-optimised version of Mascot Percolator and a modified version of SLoMo (turbo-SLoMo) for analysis of phosphoproteomic data. We have benchmarked this against several database searching algorithms and phosphorylation site localisation tools and show that it offers highly sensitive and confident phosphopeptide identification and site assignment with PS...
Journal:
  • Journal of proteome research

Volume 7, Issue 6

Pages -

Publication date: 2008